Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move PAM generation to build time, optimizing download size #157

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

LuigiPiucco
Copy link

Although from the titles alone it may seem unrelated to the issue, the original goal was to fix #59. The approach taken, as suggested in that thread, is to generate the Peripheral Access Modules (PAMs) after the user has downloaded the package, in an automated fashion. So, instead of shipping the modules generated "by hand" (with semi-automation actually, but still), we ship the ATDF sources from which they are generated. When running build either in the crate root or as a dependency, the build.rs script outputs a module for the selected MCU into a known path. The generation steps are described in the README.

The result of this is that the packaged .crate file does become much smaller, just short of 5 times (4.875...). Compilation time increases, of course, in particular as a fault of the build script. This could be improved by using the executable versions of svd2rust, svdtools and atdf2svd, but that would greatly reduce reproducibility. rustfmt is used via the executable, since it does not provide a library API, but in that case it should be fine, since anyone with a toolchain capable of compiling to AVR probably also has access to rustfmt.

The commit changing the example from a crate to an actual Cargo example, while technically not necessary, makes testing the use of the crate as a dependency more convenient.

For a reference of crate size, see #59 (comment).

@LuigiPiucco
Copy link
Author

@Rahix just a reminder about this.

@LuigiPiucco
Copy link
Author

Is there still interest in this PR? Not to rush, but I'd like to know, since if there's not or not yet, I'll hold off on fixing CI. It's failing because this automated approach doesn't need the Makefile, and the current CI code expects it to be there.

@LuigiPiucco LuigiPiucco force-pushed the size-opt branch 2 times, most recently from b109bb2 to 0d3187c Compare January 8, 2025 22:33
@tones111
Copy link
Contributor

tones111 commented Jan 9, 2025

Looking over the patch series it looks like it's trying to accomplish multiple independent efforts. As a diff increases in size or impact it becomes much harder to review.

If the primary motivation here is to transition from the makefile to build.rs then I would suggest trying to move anything unrelated into separate pull requests. For example, the first commit in the series ("refactor patches ...") doesn't appear to support the build.rs migration and likely adds unrelated complexity. I have the same concern for the second and third patches in the series too.

@LuigiPiucco
Copy link
Author

LuigiPiucco commented Jan 9, 2025

The focus was on the build.rs transition, and the earlier commits help with that, but I could split them out. Here's my reasoning for the current series, and how I plan to shorten the diff:

  • The first one is leftover from updating svd2rust and svdtools while doing this. I thought it would be less work to update them before implementing build.rs, because updating afterward would require revisiting the build logic too. An old commit message referenced this, I think. Regardless, I can probably remove it with some changes to commit 4;
  • The second one is really independent, it makes testing easier but can be removed with a minor change to commit 5;
  • The third one is relevant because the implementation prior to it required a vector module for the interrupt macro, which I had trouble generating from the build script, and it's also needed to allow more than 1 MCU to be built. The module requirement has since been removed in the cortex-m crate, which the implementation is based on, so I pulled that in here. Removing this one would enlarge commit 4 with extra code to generate the module plus something other solution to allow multi-MCU builds, which doesn't seem beneficial. I think the commit message could use a better explanation though;
  • The fourth one is the main bit of this PR, I don't think I can remove anything without breaking it. Maybe this piece could be hoisted to here though, it may shorten the diff a little;
  • The fifth one already seems OK in size.

I'll wait to hear your opinions on what I propose above before changing anything though.

@Rahix
Copy link
Owner

Rahix commented Jan 9, 2025

From my side, I wouldn't worry too much about the total diff of the PR as long as the commits stay clean and logical. IIRC you're one to take this serious @LuigiPiucco, so I have no concerns :)

To give a few short comments on the approach:

  • Your first commit, dd32b3b, is then mostly related to Update to svd2rust 0.33.1 #155. I guess we can do the upgrade as layed out in that PR but I am not sure if we should also pull in the naming changes from svd2rust just yet. I'd for sure like to keep this separate to the topic here, though. So my main question would be: Do you need Update to svd2rust 0.33.1 #155 for the newer svd2rust version in any case or can we stick to the current one for the time being?
  • I would push your second commit, ff31dc0, to the end of the series and then actually have the example use the in-repository version of avr-device directly via path dependency. The rather ugly current solution of the example relying on the latest published version from crates.io was only put in place due to the complicated codegen setup.
  • The third commit, 8a588f8, is very important to discuss. I took a look a the changes and right now I don't quite understand how you do the translation from interrupt name to vector name? The reason we needed the vector module so far is that we have to translate the interrupt names into their corresponding vectors. I couldn't find where this translation is happening in your code.
  • Remainder looks fine at a quick glance, let's do a more in-depth review when the time comes.

To be completely honest, I am still on the edge about this build-time approach. I do see the benefits but I also think there is a good reason why most other embedded Rust platforms have opted to publish pre-generated crates as we currently do.

That said, I am not opposed to going this route. Mainly because the current setup is very much hitting its limits. We'll be forced to split avr-device into multiple crates soon if we continue pre-generating the code. Build-time generation would solve this, at least until the amount of vendor-ATDFs explodes dramatically :)

The biggest fear I have is build-time. For downstream users, I assume the change to be neglible because they will only incur the additional time on first build. But where it will really hurt is in CI environments like the one avr-hal currently uses. I assume we will need to start caching build-artifacts there to cope with build-time codegen of avr-device. Maybe as an experiment, you could set up an avr-hal PR which uses your version of build-time generated avr-device as a dependency so we can then compare the CI times?

That's my two cents on the topic, I am very sorry it took this much time for me to get back to you. In any case, this is impressive work, thanks a lot for working on this and demonstrating that build-time codegen isn't as unrealistic as I had imagined so far!

@Rahix
Copy link
Owner

Rahix commented Jan 9, 2025

A few more comments:

This could be improved by using the executable versions of svd2rust, svdtools and atdf2svd, but that would greatly reduce reproducibility

I am strongly against using executables here. The current way of interfacing via the tools as pure-rust dependencies is what we need to be doing.

since anyone with a toolchain capable of compiling to AVR probably also has access to rustfmt

rustfmt is an optional toolchain component, we cannot rely on it being available. We have to decide between

  1. Using rusfmt optionally when available and skipping formatting when it is missing.
  2. Finding another way to interact with rustfmt or another rust formatter as a rust dependency. I'm sure we are not the first to hit this problem, maybe there is some alternative available

@tones111
Copy link
Contributor

tones111 commented Jan 9, 2025

I started experimenting with a parallel implementation today and (like you) ran into some accidental complexity patching the svd file. I created svdtools #265 which, if implemented, could simplify our implementation by not having to re-write the top level patch files in a temp directory.

I'm still pretty new to SVD patching, but I think it should be doable.

@LuigiPiucco
Copy link
Author

The third commit, 8a588f8, is very important to discuss. I took a look a the changes and right now I don't quite understand how you do the translation from interrupt name to vector name? The reason we needed the vector module so far is that we have to translate the interrupt names into their corresponding vectors. I couldn't find where this translation is happening in your code.

I revisited the cortex-m-rt crate, and it has more than just the the macro to implement its interrupts. This was indeed an oversight on my part. Their approach is that the macro doesn't convert the names to libc's __vector_N symbols, it just generates code to guarantee the interrupt exists and renames it to be reachable only from the linker script. This is what my commit does, but it doesn't bring in a linker script and a vector table like it would need. Doing it like this could allow us to fix #76, but that's a separate issue for later. For now, generating the vector module like before is not viable with this approach, and the only thing I can think of is to generate a linker script with PROVIDE statements from the build.rs. Would that be OK?

The issue with the module is that it needs to be present when the macros compile, and they compile before anything is generated by build.rs. We'd need to replicate a build.rs inside the macros crate, which would get messy really quick.

About the other comments, I'll reply later, I already spent quite a while trying to find an alternative to the above and it's getting late.

@Rahix
Copy link
Owner

Rahix commented Jan 10, 2025

Custom linker scripts will be a part of tackling #76, I don't think we should separate these two. The linker scripts need to be synchronized with the libc runtime, I'd fear breaking things if we just blindly substitute them.

I agree that generating the vectors from build.rs is tricky... The only idea I have left is generating a hidden macro in the main avr-device crate that does the translation. The #[interrupt] macro then only generates an invocation of that hidden macro. Like this:

     quote!(
         #(#cfgs)*
         #(#attrs)*
-        #[allow(static_mut_refs)]
-        #[doc(hidden)]
-        #[export_name = #ident_s]
-        pub unsafe extern "avr-interrupt" fn #tramp_ident() {
-            #ident(
-                #(#resource_args),*
-            )
-        }
+        avr_device::__avr_interrupt_trampoline!{
+            #ident_s, #ident, #(#resource_args),*
+        }
 
         #[doc(hidden)]
         #f
     )

@LuigiPiucco
Copy link
Author

The only idea I have left is generating a hidden macro in the main avr-device crate that does the translation. The #[interrupt] macro then only generates an invocation of that hidden macro.

That does work, nice catch! I'll push my implementation after I clean it up, but I confirmed the ELF for the example includes a defined __vector_N symbol now, and the jump table changes to call it instead of __bad_interrupt.

rustfmt is an optional toolchain component, we cannot rely on it being available. We have to decide between

1. Using rusfmt optionally when available and skipping formatting when it is missing.

2. Finding another way to interact with rustfmt or another rust formatter as a rust dependency.  I'm sure we are not the first to hit this problem, maybe there is some alternative available

I don't think 2 can work due to rust-lang/rustfmt#5955. 1 seems reasonable, and another option is prettyplease, which seems even better than rustfmt for our use-case. The only issue I can see is that svd2rust only outputs a string, so we'd have to parse it with syn which may increase compile time, but that assumption needs testing.

The biggest fear I have is build-time.

In my quick testing, building with --all-features takes almost exactly 1 minute. On CI, the current run took 2 minutes on compiling (that includes everything down from core, though). I think that is OK for us, and the end-user won't see more than a couple of seconds with just 1 MCU enabled.

@Rahix
Copy link
Owner

Rahix commented Jan 10, 2025

Neat, prettyplease looks perfect for what we are doing here. Would be my favorite, then :)

@LuigiPiucco
Copy link
Author

About updating svd2rust, I need at least this commit, which is already version 0.31.3 (the option got renamed before publishing, it's actually skip_crate_attrs), so we will have to pull in the field -> method changes in this PR anyway. I think I can leave out the upstream casing-related changes, though we might as well get that done too.

build.rs Outdated Show resolved Hide resolved
@tones111
Copy link
Contributor

tones111 commented Jan 11, 2025

The first commit in the series (dd32b3b) generates an error for me. My tool versions match what's specified in the readme.

[2025-01-11T20:48:25Z ERROR svdtools::cli] by svdtools (0.4.0)

    Caused by:
        0: Processing device `ATmega8`
        1: According to `TC0`
        2: Processing peripheral `TC0`
        3: According to `TCCR0`
        4: Processing register `TCCR0`
        5: Modifying fields matched to `CS0`
        6: Could not find `TC0.TCCR0:CS0. Present fields: CS00, CS01, CS02.`

Update:
The SVD output of this commit seems questionable. For example, it looks like a lot of mcus (ex: atmega328p) no longer have an SPI_MASTER enumeral for the UMSEL field, but I see it listed in the datasheets. These discrepancies are still present when building from the head of this branch.

@LuigiPiucco
Copy link
Author

LuigiPiucco commented Jan 11, 2025

I had forgotten to update the README requirements, and there was a change that sneaked into a later commit while rebasing. Now it should work, though I haven't thoroughly tested building in between each commit.

Edit: I also forgot to push the changes. I will investigate the missing elements in the SVD as well.

LuigiPiucco and others added 5 commits January 12, 2025 13:12
The atmega328p example was corrected to svd2rust's new naming
convention, too.
This replaces the AoT processing we did to generate the Rust modules
locally. It generates them on the fly at build-time, and only for the
selected MCUs. The process is mostly the same, just automated, with the
addition of what is described in the next paragraph. Some things became
unnecessary though, such as the `modrs.patch` and `Makefile`, and
therefore were removed. `form` is no longer run, in order to minimize
the number of files and directories. The patches were updated to not
have the `_svd` key, since that's now handled by the build script. Those
that ended up empty were removed.

It also updates our `interrupt` macro, adapting it from a newer
iteration of `cortex-m-rt`'s and adding logic to make the vector module
unnecessary. It would be hard to generate it correctly for the macros
crate, since it compiles before the main one where the build logic is
hosted. Instead, we generate a macro `__avr_device_trampoline` in the
main crate, and `#[interrupt]` calls into that giving the MCU name (if
passed as an argument), interrupt name and trampoline item to define.
This new macro converts the interrupt's name into a `__vector_N` symbol,
which the linker understands as being an interrupt, and changes the
function's name to it with `#[export_name = "..."]`. The MCU argument to
the attribute macro becomes optional, if it's absent we call into
`__avr_device_trampoline_single`, which expands to
`__avr_device_trampoline` with the MCU filled in or a compiler_error if
there are more than one selected.

The build script also allows us to differentiate a build for a single
MCU vs. one for many (via `#[cfg(single_mcu)]`). This is useful in the
interrupt implementation above, and may be useful in avr-hal.

Co-authored-by: Rahix <[email protected]>
Co-authored-by: tones111 <[email protected]>
It was a separate crate previously, but that made its compilation
completely independent of the actual PAC crate. The user can still use
that directory as a template, they'll just need to rename
`Cargo.toml.example` into place. The changed steps for that and to test
the example are documented in `examples/atmega328p/README.md`. The
target used was updated from rustc, as well, and hoisted up as a default
for the whole crate. This allows to more easily build the
library/example for quick testing.
The temporary SVD files are now generated in $OUT_DIR/svd instead of a
temp directory, so that we can upload them for inspection as part of the
CI output.
This replaces the dependency on the rustfmt executable with an actual
tracked build dependency. Since we have to parse with syn anyway now, I
also took the opportunity to swap the string replacement for a
syntax-tree based replacement.
Copy link
Owner

@Rahix Rahix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, managed to dig through all the changes in this PR now. Please check my comments below, there are a few topics I'd still like to discuss.

Thank you so much for all your efforts here, I really like how it is coming together now.

Comment on lines +56 to +62
panic!(
"No MCU feature selected! \
The avr-device crate requires one to be enabled in order to \
know your target's peripherals. Currently available are:\n\
{}",
mcus.into_iter().collect::<Vec<_>>().join("\n"),
)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use cargo::error= to emit the error message please.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know about that instruction, will do.

println!("cargo::rerun-if-changed={}", atdf_file.display());
let patch_dir = crate_root.join("patch");
let patch_file = patch_dir.join(format!("{}.yaml", mcu));
println!("cargo::rerun-if-changed={}", patch_file.display());
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not enough - you also have to rerun-if-changed all the included patch files as well. In the Makefile, we did that with the svdtools makedeps command.

Ok(chip) => chip,
Err(what) => {
let _ = what.format(&mut io::stdout());
panic!("Failed to parse \"{}\"!", atdf_file.display());
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, please use cargo::error=.

let atdf_parsed = match atdf::parse(atdf_reader, &HashSet::<String>::new()) {
Ok(chip) => chip,
Err(what) => {
let _ = what.format(&mut io::stdout());
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Errors should be sent the stderr. But even better, I think we should emit the error towards cargo directly with cargo::error=?

Comment on lines +168 to +170
let crate_root = PathBuf::from(env::var_os("CARGO_MANIFEST_DIR").unwrap());
let atdf_file = crate_root.join("vendor").join(format!("{}.atdf", mcu));

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You already construct this atdf_file path in build_mcu_module(), why not pass it to get_atdf_parsed()?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get your intention behing making this example a crate-example of avr-device. But I'm not sure if this is really a good idea.

My main hope for this example was that it is as straight-forward as possible to copy the examples/atmega328p/ subdirectory and start building your own project. To be honest, I do not really see a reason for this example to exist beyond that usecase.

By moving the example to become a crate-example, the individual bits and pieces needed for starting a new "bare avr-device" project are a lot more dispersed and it becomes harder to get started.

So I would be in favor of keeping things as they were here.

lto = true
opt-level = "s"

[profile.release]
panic = "abort"
codegen-units = 1
debug = true
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't remove this. Otherwise you don't have symbol info for release builds.

This setting has no influence on the on-flash size of the binary, if that was your motivation. The debug info is only stored in the .elf file.

@@ -1,10 +1,3 @@
/target/
/macros/target/
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't remove this. macros/ is not in a workspace so if you end up calling cargo build inside the directory, it will put its build artifacts into /macros/target/.

Comment on lines +16 to +22
println!("cargo::rustc-check-cfg=cfg(single_mcu)");

let available_mcus = get_available_mcus();
let mcus = select_mcus(available_mcus);
if mcus.len() == 1 {
println!("cargo::rustc-cfg=single_mcu");
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think special-casing single-MCU builds is a good idea. This violates cargo's "features are strictly additive" concept which we do want to uphold here in avr-device at least.

I assume your motivation is getting rid of the MCU name from the interrupt macro invocations, right? I do agree that this would be nice to have, but I don't think it is critical or related to the main topic of this PR.

So I'd rather not include this here and discuss the interrupt macro separately. In fact I think a different solution is needed instead of this feature-hacking approach. For example, we could consider something like what's done by cortex-m-rt, although I find their approach of abusing the overlap of macro and module namespaces rather cursed as well...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will go back to requiring the name for now.

@@ -22,7 +22,6 @@ USART?:
_replace_enum:
USART_ASYNC: [0, "Asynchronous USART"]
USART_SYNC: [1, "Synchronous USART"]
SPI_MASTER: [3, "Master SPI (MSPIM)"]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's this about?

@LuigiPiucco
Copy link
Author

I'm in the process of stripping the PR to the bare minimum, but I've met some issues with versioning.

  1. Unsupported proc macro punctuation character while rendering device from svd rust-embedded/svd2rust#863 is only fixed in the latest version of svd2rust, and using recent proc-macro2 (which prettyplease requires) breaks the generation altogether on any version of svd2rust prior.
  2. cargo::error= requires a very recent compiler, 1.84 and upwards, so I'll need to drop it if keeping the compiler requirement is a goal.

@Rahix How do you prefer to proceed? If we keep the old version of svd2rust, prettyplease cannot be used to format the code, and we lock our users into also using old versions of proc-macro2 (plus I think also an old compiler). If we update it, it's likely we'll have to update the compiler, plus use the field/method changes and most liked the casing changes as well. Preventing panics in the build script is also only possible if we update de compiler.

@Rahix
Copy link
Owner

Rahix commented Jan 16, 2025

So this is huge chain of interdependencies that we can't easily resolve right now:

  1. We can't upgrade the compiler right now because of some pending regression fixes. See Upgrade rust toolchain to nightly-2025-01-01 avr-hal#585.
  2. As you probably noticed, we can't upgrade proc-macro2 because of the old compiler. Also see cargo build - error[E0658] - proc_macro::Literal::byte_character(byte) avr-hal#537.
  3. So being stuck with the old proc-macro2 for now, we can't use prettyplease, right?
  4. And because of the old compiler, we also can't use cargo::error=?

My take is this: Let's put as little effort as possible into anything relating to legacy versions of dependencies and the compiler. But I'd still like to see your changes happen soon regardless. So my suggestion would be that we stick to rustfmt invocation (if available) for now. And then also keep the panics in the build script as well.

Once we can move forward with the compiler, we do a big breaking change of upgrading svd2rust, proc-macro2, and at that point also introduce prettyplease and better build-script errors.

What do you think?

@tones111
Copy link
Contributor

We can't upgrade the compiler right now because of some pending regression fixes.

If I'm reading that issue correctly it sounds like most of the regressions are fixed and backported into rustc. There are a few mcus failing to pass a test suite but most are working well enough with recent nightlies. Please correct me if that's not accurate.

If that's the case couldn't we land a commit that updates the compiler version and comments out the feature flags for a bad mcus? We can repeat once the new fixes make it into rustc. Obviously we wouldn't publish a new release until support can be restored for all the processors again but this would allow us to continue making infrastructure progress while we have some momentum (instead of grinding to a halt).

Alternatively we could do that on a different branch if you're concerned about breaking main and merge it in once everything's working again.

Any thoughts on my efforts to try and break this into smaller efforts?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Investigate crate-size optimizations
3 participants